Selecting Feature Sets for Comparative and Time-Oriented Quality Estimation of Machine Translation Output
نویسندگان
چکیده
This paper describes a set of experiments on two sub-tasks of Quality Estimation of Machine Translation (MT) output. Sentence-level ranking of alternative MT outputs is done with pairwise classifiers using Logistic Regression with blackbox features originating from PCFG Parsing, language models and various counts. Post-editing time prediction uses regression models, additionally fed with new elaborate features from the Statistical MT decoding process. These seem to be better indicators of post-editing time than blackbox features. Prior to training the models, feature scoring with ReliefF and Information Gain is used to choose feature sets of decent size and avoid computational complexity.
منابع مشابه
Machine learning methods for comparative and time-oriented Quality Estimation of Machine Translation output
This paper describes a set of experiments on two sub-tasks of Quality Estimation of Machine Translation (MT) output. Sentence-level ranking of alternative MT outputs is done with pairwise classifiers using Logistic Regression with blackbox features originating from PCFG Parsing, language models and various counts. Post-editing time prediction uses regression models, additionally fed with new el...
متن کاملتخمین اطمینان خروجی ترجمه ماشینی با استفاده از ویژگی های جدید ساختاری و محتوایی
Despite machine translation (MT) wide suc-cess over last years, this technology is still not able to exactly translate text so that except for some language pairs in certain domains, post editing its output may take longer time than human translation. Nevertheless by having an estimation of the output quality, users can manage imperfection of this tech-nology. It means we need to estimate the c...
متن کاملOn the Translation Quality of Google Translate: With a Concentration on Adjectives
Translation, whose first traces date back at least to 3000 BC (Newmark, 1988), has always been considered time-consuming and labor-consuming. In view of this, experts have made numerous efforts to develop some mechanical systems which can reduce part of this time and labor. The advancement of computers in the second half of the twentieth century paved the ground for the invention of machine tra...
متن کاملImprovement of effort estimation accuracy in software projects using a feature selection approach
In recent years, utilization of feature selection techniques has become an essential requirement for processing and model construction in different scientific areas. In the field of software project effort estimation, the need to apply dimensionality reduction and feature selection methods has become an inevitable demand. The high volumes of data, costs, and time necessary for gathering data , ...
متن کاملA Hybrid Machine Translation System Based on a Monotone Decoder
In this paper, a hybrid Machine Translation (MT) system is proposed by combining the result of a rule-based machine translation (RBMT) system with a statistical approach. The RBMT uses a set of linguistic rules for translation, which leads to better translation results in terms of word ordering and syntactic structure. On the other hand, SMT works better in lexical choice. Therefore, in our sys...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2013